-
Notifications
You must be signed in to change notification settings - Fork 689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add user defined metadata #5915
Conversation
There are a few implementation details I'd like to call out:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR looks good to me overall!
Some missing works could be tracking in follow up:
- Add custom metadata support in multipart upload
- Add custom metadata integration test
- Track the current progress of webdav(http)
- Local fs might support custom metadata by
xattr
Some design decision like naming need reviews from @tustvold and @alamb.
@@ -61,6 +61,7 @@ use std::sync::Arc; | |||
|
|||
const VERSION_HEADER: &str = "x-amz-version-id"; | |||
const SHA256_CHECKSUM: &str = "x-amz-checksum-sha256"; | |||
const USER_DEFINED_METADATA_HEADER_PREFIX: &str = "x-amz-meta-"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe METADATA_HEADER_PREFIX
is clear enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to rename to whatever. I chose to be quite verbose because I feel there's a lot of confusion about "metadata". You can see this play out in #4754, where the conversation muddles user-defined metadata and tags. Add in standard (non-user-defined) metadata and it gets even more murky.
Y'all let me know what to rename to, though. :)
@@ -183,6 +183,8 @@ impl Client { | |||
has_content_type = true; | |||
builder.header(CONTENT_TYPE, v.as_ref()) | |||
} | |||
// Ignore metadata attributes | |||
Attribute::Metadata(_) => builder, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WebDAV handles custom metadata in prop
, which differs from other object storage services. We can create an issue to track this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the meantime I think we should return an error as per https://docs.rs/object_store/latest/object_store/struct.PutOptions.html#structfield.attributes
@@ -642,6 +647,7 @@ impl GetClient for S3Client { | |||
etag_required: false, | |||
last_modified_required: false, | |||
version_header: Some(VERSION_HEADER), | |||
user_defined_metadata_prefix: Some(USER_DEFINED_METADATA_HEADER_PREFIX), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: how about using metadata_header_prefix
?
Rust handles this by non_exhaustive attribute. |
ba2a828
to
da24f88
Compare
Thanks @Xuanwo. I've updated the PR with your suggestions. I haven't renamed the variables yet since it sounds like @alamb or @tustvold need to weigh in. Regarding your note about "Add custom metadata integration test", I was hoping this change would test the new functionality. Is it not enough? Regarding multipart upload, I think this is already handled. AWS's client has: pub async fn create_multipart(
&self,
location: &Path,
opts: PutMultipartOpts,
) -> Result<MultipartId> {
let response = self
.request(Method::POST, location)
.query(&[("uploads", "")])
.with_encryption_headers()
.with_attributes(opts.attributes)
.with_tags(opts.tags)
.idempotent(true)
.send()
.await?
.bytes()
.await
.context(CreateMultipartResponseBodySnafu)?;
let response: InitiateMultipartUploadResult =
quick_xml::de::from_reader(response.reader()).context(InvalidMultipartResponseSnafu)?;
Ok(response.upload_id)
} The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THank you @criccomini and @Xuanwo for the review
Please ping me if you need another review. I think @tustvold may be back next week and he might get to reviewing this first
Sounds good! I'll check in next week. 😄 |
object_store/src/attributes.rs
Outdated
/// Specifies a user-defined metadata field for the object | ||
/// | ||
/// The String is a user-defined key | ||
Metadata(String), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's likely a silliness that will make no measurable performance difference, but given these will often be static strings, I wonder what you think of making this Cow<'static, str>
if let Some(prefix) = T::HEADER_CONFIG.user_defined_metadata_prefix { | ||
for (key, val) in response.headers() { | ||
if let Some(suffix) = key.as_str().strip_prefix(prefix) { | ||
if let Ok(val_str) = val.to_str() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should error here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor nits, nice to see the benefits of all the time spent DRYing up the client logic in making this a small change 😄
@tustvold I'm ready for another review here. I think I addressed all of your comments. I had some issues with the |
Gentle nudge here =) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me thank, sorry for the delay
Which issue does this PR close?
Closes #4754
Rationale for this change
We would like to be able to read and write user-defined metadata for AWS, GCP, and Azure. This use-case is mentioned briefly in #4754 before the conversation shifts to tags. It appears that tags were implemented, but user-defined metadata has not yet been.
What changes are included in this PR?
Are there any user-facing changes?
I've added a
Metadata(String)
attribute to theAttribute
enum. Other than that I don't believe there are any user-facing changes.I do not believe there are any breaking changes. The user-facing change (
Attributes::Metadata(String)
) is additive and should be compatible. The one caveat here is I'm not sure how Rust handles enums (open vs. closed). If it handles them the way Java does (i.e. adding new enums can break old code) then this would be a breaking change. I'll leave it to the Rust experts to clarify this.